Self-evolving evaluation benchmarks research Internship

Location: Cambridge, England, United Kingdom
Job ID: R-244908
Date posted: 31/01/2026

AstraZeneca is a global, science-led biopharmaceutical business and its innovative medicines are used by millions of patients worldwide! AstraZeneca Summer Internships introduce you to the world of ground-breaking drug development, embedding you in highly dedicated teams committed to delivering life-changing medicines to patients. Our 10–12-week program is designed for undergraduate, master's, and doctoral students. We offer exciting opportunities across Research & Development, Operations, and Enabling Units (Corporate functions).

Our internships immerse students in the pharmaceutical industry, giving them the opportunity to contribute to our diverse pipeline of medicines, whether in the lab or outside of it. You will feel trusted and empowered to take on new challenges, but with all the help and guidance you need to succeed. This internship will help you develop essential skills, expand your knowledge, and build a network that will set you up for future success. You will be surrounded by curious, passionate, and open-minded professionals eager to learn and follow the science, fostering your growth in a truly collaborative and global team.

Introduction to role

Join us at the Center for Artificial Intelligence (CAI), where we design next‑generation evaluation methods for advanced agentic AI systems used across scientific workflows. In this role, you will contribute to a research project focused on developing self‑evolving benchmarking frameworks, where evaluation criteria continuously adapt based on model behaviour, evidence quality, and observed failure modes. You will explore how dynamic criteria, evidence‑grounded scoring, and adversarial testing can maintain benchmark discriminative power as AI systems improve. Working closely with experts in machine learning, scientific reasoning, and evaluation science, you will gain hands‑on experience building tools that support trustworthy and scalable assessment of AI systems used in multi‑agent scientific workflows.

Accountabilities

As an intern, you will be engaged with several key responsibilities, including:
  • Developing a self-evolving benchmarking framework, incorporating dynamic rubric criteria.
  • Designing and implementing evidence-grounded scoring mechanisms, ensuring that model claims and reasoning steps are supported by verifiable traces, tool outputs, or retrieved evidence.
  • Investigating robustness and anti-gaming strategies, including adversarial testing to detect behaviours where models optimize the score without improving real-world quality.
  • Building lightweight benchmarking tools, following solid software engineering practices to ensure reproducibility, traceability, and modularity.
  • Analyzing model behaviour across multiple scientific task families, such as protocol drafting, reasoning chains, and multi-agent planning, to assess the generality of the evolving benchmark.
  • Collaborating with scientists to identify key failure modes, high-value assessment signals, and opportunities to integrate the benchmarking framework into scientific workflows.

Essential Skills/Experience

The ideal candidate will possess the following skills and experience:

Essential:
  • Currently pursuing a PhD in computer science, machine learning, computational sciences, AI evaluation/robustness, or a related field.
  • Strong experience with machine learning and deep learning methods, ideally including evaluation- or alignment-related work.
  • Excellent Python programming skills; familiarity with frameworks such as PyTorch, JAX, or TensorFlow.
  • Strong analytical mindset with enthusiasm for evaluation science, reliability, and AI governance.
  • Ability to work collaboratively in a team environment and communicate scientific ideas effectively.
  • Must be at least 18 years of age at time of application.
  • Must have UK right-to-work status.
  • Must return to schooling at program close (candidates graduating before or during the program are ineligible).

Desirable:
  • Experience with benchmarking, evaluation rubrics, reinforcement learning from human/AI feedback, or model auditing.
  • Familiarity with agentic AI systems, tool-using models, multi-agent workflows, or long-context reasoning analysis.
  • Knowledge of rubric-based scoring, checklists, or structured evaluation frameworks.
  • Experience with adversarial testing, generative model safety, or failure mode taxonomy development.
  • Interest in applying evaluation science to scientific, biomedical, or protocol generation tasks.

This internship is a valuable opportunity to immerse yourself in cutting‑edge research on AI evaluation and robustness, with access to the necessary computational resources and mentorship from leading experts in the field. If you are ready to transform your technical knowledge into real-world applications, we encourage you to apply and become a part of our team driving innovation at AstraZeneca. Our collaborative environment is designed to help you grow professionally and personally, surrounded by passionate individuals eager to make a difference.

AstraZeneca is where you can immerse yourself in groundbreaking work with real patient impact.

Trusted to work on important projects, you’ll have the independence to take on new challenges while receiving all the guidance you need to succeed.

Our mission is to build an inclusive and equitable environment. We want people to feel they belong at AstraZeneca, starting with the recruitment process. We welcome and consider applications from all qualified candidates, regardless of characteristics.
We offer reasonable adjustments/accommodations to help all candidates to perform at their best. If you have a need for any reasonable adjustments/accommodations, please complete the section in the application form.

Ready to make an impact? Apply now and join us on this exciting journey!

#Earlytalent

Date Posted

30-Jan-2026

Closing Date

13-Feb-2026

AstraZeneca embraces diversity and equality of opportunity. We are committed to building an inclusive and diverse team representing all backgrounds, with as wide a range of perspectives as possible, and harnessing industry-leading skills. We believe that the more inclusive we are, the better our work will be. We welcome and consider applications to join our team from all qualified candidates, regardless of their characteristics. We comply with all applicable laws and regulations on non-discrimination in employment (and recruitment), as well as work authorisation and employment eligibility verification requirements.
